Genome-Wide Search for Local DNA Segments with Anomalous GC-Content

Authors

  • Andrey V. Ilatovskiy
  • Michael Petukhov
Abstract

An anomalous (i.e., significantly different from the genome average) GC-content is often used as one of the markers for detecting horizontal gene transfer (HGT) events. Unfortunately, results obtained by traditional fixed-length window analysis depend strongly on an arbitrary choice of DNA window length. Here we present a new method for genome-wide statistical analysis of GC-content without that drawback. The method is based on a set of nonparametric statistical tests and provides reliable estimates of both local and global GC-content, and thus can identify small local areas (as short as 30 bp) with anomalous GC-content in a bacterial genome. The tests, applied to the well-studied bacterial genome of Escherichia coli K-12, show that approximately 21% of the genome belongs to anomalous GC-content areas. Among the top 23 anomalous GC-content areas, seven correspond to annotated prophages, four to Rhs elements, and two to IS elements. The remaining 10 areas contain putative horizontally transferred DNA and genes with still unknown functions. Software is available at http://mml.spbstu.ru/gcstat.
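
The window-length dependence mentioned in the abstract can be illustrated with a minimal Python sketch of the traditional fixed-length window analysis. This is not the authors' nonparametric method, whose details are not given here. The function names, the toy sequence, and the 0.5 deviation threshold are all assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step=None):
    """Return (start, gc_fraction) for each full fixed-length window."""
    step = step or window  # non-overlapping windows by default
    return [(start, gc_fraction(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


def anomalous_windows(seq, window, threshold):
    """Windows whose GC fraction deviates from the whole-sequence average
    by more than `threshold` (an arbitrary cutoff, assumed for illustration)."""
    mean_gc = gc_fraction(seq)
    return [(start, gc) for start, gc in windowed_gc(seq, window)
            if abs(gc - mean_gc) > threshold]


# Hypothetical toy sequence: AT-only background with a 30 bp GC-only insert.
toy = "AT" * 60 + "GC" * 15 + "AT" * 60

# A 30 bp window aligned with the insert flags it ...
short = anomalous_windows(toy, window=30, threshold=0.5)
# ... while a window spanning the whole sequence averages the signal away.
long_ = anomalous_windows(toy, window=270, threshold=0.5)
```

With these toy inputs, `short` contains the single window starting at position 120, whereas `long_` is empty: the same insert is detected or missed depending only on the window length chosen.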

Similar resources

Exponential decay of GC content detected by strand-symmetric substitution rates influences the evolution of isochore structure.

The distribution of guanine and cytosine nucleotides throughout a genome, or the GC content, is associated with numerous features in mammals; understanding the pattern and evolutionary history of GC content is crucial to our efforts to annotate the genome. The local GC content is decaying toward an equilibrium point, but the causes and rates of this decay, as well as the value of the equilibriu...

Improved PCR Amplification of Broad Spectrum GC DNA Templates

Many applications in molecular biology can benefit from improved PCR amplification of DNA segments containing a wide range of GC content. Conventional PCR amplification of DNA sequences with regions of GC less than 30%, or higher than 70%, is complex due to secondary structures that block the DNA polymerase as well as mispriming and mis-annealing of the DNA. This complexity will often generate ...

High G+C Content of Herpes Simplex Virus DNA: Proposed Role in Protection Against Retrotransposon Insertion

The herpes simplex virus dsDNA genome is distinguished by an unusually high G+C nucleotide content. HSV-1 and HSV-2, for instance, have GC contents of 68% and 70% respectively, while that of the host (human) genome is 41%. To determine how GC content varies with genome location, GC content was measured separately in coding and intergenic regions of HSV-1 DNA. The results showed that the 75 gene...

Evolutionary Consequences of DNA Methylation on the GC Content in Vertebrate Genomes

The genomes of many vertebrates show a characteristic variation in GC content. To explain its origin and evolution, mainly three mechanisms have been proposed: selection for GC content, mutation bias, and GC-biased gene conversion. At present, the mechanism of GC-biased gene conversion, i.e., short-scale, unidirectional exchanges between homologous chromosomes in the neighborhood of recombinati...

Multiscale DNA partitioning: statistical evidence for segments

MOTIVATION: DNA segmentation, i.e. the partitioning of DNA in compositionally homogeneous segments, is a basic task in bioinformatics. Different algorithms have been proposed for various partitioning criteria such as Guanine/Cytosine (GC) content, local ancestry in population genetics or copy number variation. A critical component of any such method is the choice of an appropriate number of segm...

Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 16, Issue 4

Pages -

Publication date: 2009